Aligning Articles in TV Newscasts and Newspapers
Abstract
It is important to use pattern information (e.g. TV newscasts) and textual information (e.g. newspapers) together. For this purpose, we describe a method for aligning articles in TV newscasts and newspapers. To align articles, the alignment system uses words extracted from telops in TV newscasts. The recall and the precision of the alignment process are 97% and 89%, respectively. In addition, using the results of the alignment process, we develop a browsing and retrieval system for articles in TV newscasts and newspapers.

1 Introduction

Pattern information and natural language information used together can complement and reinforce each other to enable more effective communication than either medium can alone (Feiner 91) (Nakamura 93). A good example is a TV newscast and a newspaper. In a TV newscast, events are reported clearly and intuitively with speech and image information. In a newspaper, on the other hand, the same events are reported in text more precisely than in the corresponding TV newscast. Figure 1 and Figure 2 are examples of articles in TV newscasts and newspapers, respectively, and report the same accident, that is, the airplane crash in which the Commerce Secretary was killed. However, it is difficult to use newspapers and TV newscasts together without aligning articles in the newspapers with those in the TV newscasts. In this paper, we propose a method for aligning articles in newspapers and TV newscasts. In addition, we show a browsing and retrieval system for aligned articles in newspapers and TV newscasts.

2 TV Newscasts and Newspapers

2.1 TV Newscasts

In a TV newscast, events are generally reported in the following modalities:
• image information,
• speech information, and
• text information (telops).
In TV newscasts, the image and the speech information are the main modalities. However, it is difficult to obtain precise information from these modalities.
The text information, on the other hand, is a secondary modality in TV newscasts, which gives us:
• explanations of image information,
• summaries of speech information, and
• information which is not concerned with the reports (e.g. a time signal).
Of these three types of information, the first and second represent the contents of the reports. Moreover, it is not difficult to extract text information from TV newscasts, because a lot of work has been done on character recognition and layout analysis (Sakai 93) (Mino 96) (Sato 98). Consequently, we use this textual information for aligning the TV newscasts with the corresponding newspaper articles. The method for extracting the textual information is discussed in Section 3.1. However, we do not treat the method of character recognition in detail, because it is beyond the main subject of this study.

2.2 Newspapers

The text of a newspaper article may be divided into four parts:
• headline,
• explanation of pictures,
• first paragraph, and
• the rest.
In the text of a newspaper article, information is generally given in order of importance. In other words, the headline and the first paragraph of a newspaper article give us the most important information. In contrast, the rest of the article gives us additional information. Consequently, headlines and first paragraphs contain more significant words (keywords) for representing the contents of the article than the rest.
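
The four-part view of a newspaper article can be made concrete with a small data structure. The sketch below is only an illustration, not the alignment procedure of this paper. The class `NewspaperArticle`, the functions `tokenize` and `shared_words_by_part`, the regular-expression tokenizer, and the sample texts are all hypothetical and invented for this example. It records the four parts and reports which telop words occur in each, so that matches in the headline and first paragraph can be told apart from matches in the rest.

```python
import re
from dataclasses import dataclass


@dataclass
class NewspaperArticle:
    """The four parts of a newspaper article named in Section 2.2."""
    headline: str
    picture_explanation: str
    first_paragraph: str
    rest: str


def tokenize(text: str) -> set[str]:
    # Assumption: lowercase alphanumeric runs stand in for real word
    # segmentation, which would depend on the language of the articles.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def shared_words_by_part(
    telop_words: set[str], article: NewspaperArticle
) -> dict[str, set[str]]:
    """For each article part, return the telop words that also occur in it."""
    parts = {
        "headline": article.headline,
        "picture_explanation": article.picture_explanation,
        "first_paragraph": article.first_paragraph,
        "rest": article.rest,
    }
    return {name: telop_words & tokenize(text) for name, text in parts.items()}


# Made-up sample texts, loosely modeled on the accident mentioned in Section 1.
article = NewspaperArticle(
    headline="Commerce Secretary killed in airplane crash",
    picture_explanation="Wreckage at the crash site",
    first_paragraph="An airplane carrying the Commerce Secretary crashed.",
    rest="Officials said the weather was poor at the time.",
)
telop_words = tokenize("Airplane crash: Commerce Secretary")
shared = shared_words_by_part(telop_words, article)

for part, words in shared.items():
    print(part, sorted(words))
```

In this made-up example, all four telop words appear in the headline and none in the rest of the article, in line with the observation that headlines and first paragraphs carry more keywords than the rest.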